Random Forests for Spatially Dependent Data

Authors

Abstract

Spatial linear mixed-models, consisting of a linear covariate effect and a Gaussian process (GP) distributed spatial random effect, are widely used for analyses of geospatial data. We consider the setting where the covariate effect is nonlinear. Random forests (RF) are popular for estimating nonlinear functions, but applications of RF to spatial data have often ignored the spatial correlation. We show that this impacts the performance of RF adversely. We propose RF-GLS, a novel and well-principled extension of RF, for estimating nonlinear covariate effects in spatial mixed models where the spatial correlation is modeled using a GP. RF-GLS extends RF in the same way generalized least squares (GLS) fundamentally extends ordinary least squares (OLS) to accommodate dependence in linear models. RF becomes a special case of RF-GLS, and is substantially outperformed by RF-GLS for both estimation and prediction across extensive numerical experiments with spatially correlated data. RF-GLS can be used for functional estimation in other types of dependent data like time series. We prove consistency of RF-GLS for β-mixing dependent error processes that include the popular spatial Matérn GP. As a byproduct, we also establish, to our knowledge, the first consistency result for RF under dependence. We establish results of independent importance, including a general consistency result of GLS optimizers of data-driven function classes, and a uniform law of large numbers under β-mixing dependence with weaker assumptions. These new tools can be potentially useful for asymptotic analysis of other GLS-style estimators in nonparametric regression with dependent data.
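The abstract's analogy between OLS and GLS can be made concrete in the linear case. The sketch below is not RF-GLS and is not taken from the paper; it only illustrates the standard fact that GLS equals OLS applied to data decorrelated by the Cholesky factor of the error covariance. All function names (`exp_cov`, `cholesky`, `forward_solve`, `ols`, `gls`) and the covariance parameters are assumptions chosen for illustration, and the design is limited to two columns to keep the code short.

```python
import math


def exp_cov(coords, sigma2=1.0, phi=3.0, nugget=0.1):
    """Exponential covariance (Matern with smoothness 1/2) plus a nugget.

    Parameter values are arbitrary illustration choices.
    """
    n = len(coords)
    return [
        [
            sigma2 * math.exp(-phi * abs(coords[i] - coords[j]))
            + (nugget if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def forward_solve(L, b):
    """Solve L z = b by forward substitution."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z


def ols(columns, y):
    """Least squares for exactly two design columns via the 2x2 normal equations."""
    c0, c1 = columns
    a = sum(u * u for u in c0)
    b = sum(u * v for u, v in zip(c0, c1))
    d = sum(v * v for v in c1)
    r0 = sum(u * t for u, t in zip(c0, y))
    r1 = sum(v * t for v, t in zip(c1, y))
    det = a * d - b * b
    return ((d * r0 - b * r1) / det, (a * r1 - b * r0) / det)


def gls(columns, y, cov):
    """GLS as OLS on whitened data: premultiply every column and y by L^{-1}."""
    L = cholesky(cov)
    white_cols = [forward_solve(L, c) for c in columns]
    white_y = forward_solve(L, y)
    return ols(white_cols, white_y)
```

Calling `gls([ones, x], y, exp_cov(coords))` with an intercept column `ones` and a covariate `x` returns the two GLS coefficients. Passing an identity matrix as `cov` reduces `gls` to `ols`, which mirrors the abstract's remark that the unweighted method is a special case of the GLS-style one.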

Similar resources

Spatially Coherent Random Forests

Spatially Coherent Random Forest (SCRF) extends Random Forest to create spatially coherent labeling. Each split function in SCRF is evaluated based on a traditional information gain measure that is regularized by a spatial coherency term. This way, SCRF is encouraged to choose split functions that cluster pixels both in appearance space and in image space. In particular, we use SCRF to detect c...

Bayesian Melding of Deterministic Models and Kriging for Analysis of Spatially Dependent Data

The link between geographic information systems and decision-making approaches has motivated the invention and development of spatial data melding methods. These methods combine different data sets to achieve better results. In this paper, the Bayesian melding method for combining measurements with the outputs of deterministic models and kriging is considered. Then the ozone data in Tehran city are analyze...

Random Forests for Big Data

Big Data is one of the major challenges of statistical science and has numerous consequences from algorithmic and theoretical viewpoints. Big Data always involve massive data but they also often include data streams and data heterogeneity. Recently some statistical methods have been adapted to process Big Data, like linear regression models, clustering methods and bootstrapping schemes. Based o...

Random survival forests for high-dimensional data

Minimal depth is a dimensionless order statistic that measures the predictiveness of a variable in a survival tree. It can be used to select variables in high-dimensional problems using Random Survival Forests (RSF), a new extension of Breiman’s Random Forests (RF) to survival settings. We review this methodology and demonstrate its use in high-dimensional survival problems using a public domai...

Context-dependent feature analysis with random forests

Feature selection is often more complicated than identifying a single subset of input variables that would together explain the output. There may be interactions that depend on contextual information, i.e., variables that turn out to be relevant only in some specific circumstances. In this setting, the contribution of this paper is to extend the random forest variable importances f...

Journal

Journal title: Journal of the American Statistical Association

Year: 2021

ISSN: 0162-1459, 1537-274X, 2326-6228, 1522-5445

DOI: https://doi.org/10.1080/01621459.2021.1950003